Constructing LZ78 Tries and Position Heaps in Linear Time for Large Alphabets

Authors

  • Yuto Nakashima
  • Tomohiro I
  • Shunsuke Inenaga
  • Hideo Bannai
  • Masayuki Takeda
Abstract

We present the first worst-case linear-time algorithm to compute the Lempel-Ziv 78 factorization of a given string over an integer alphabet. Our algorithm is based on nearest marked ancestor queries on the suffix tree of the given string. We also show that the same technique can be used to construct the position heap of a set of strings in worst-case linear time, when the set of strings is given as a trie.
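For readers unfamiliar with the factorization named in the abstract, the following is a minimal sketch of the textbook LZ78 factorization built with a plain trie. It is not the paper's algorithm: it uses Python dictionaries for child lookup, so its running time is only expected linear, and it uses neither the suffix tree nor nearest marked ancestor queries. The function names `lz78_factorize` and `lz78_decode` are illustrative assumptions.

```python
def lz78_factorize(text):
    """Return LZ78 factors as (prefix_id, char) pairs.

    prefix_id refers to an earlier factor (1-based); 0 means the empty prefix.
    Illustrative sketch only: dictionary-based trie, not the paper's method.
    """
    trie = [{}]      # trie[node] maps a character to a child node id; node 0 is the root
    factors = []
    node = 0         # current position in the trie while matching the next factor
    for ch in text:
        child = trie[node].get(ch)
        if child is not None:
            node = child                  # keep extending the longest previous factor
        else:
            trie[node][ch] = len(trie)    # new trie node = new factor
            trie.append({})
            factors.append((node, ch))
            node = 0                      # restart matching from the root
    if node != 0:
        # Input ended in the middle of a match: the last factor repeats an earlier one.
        factors.append((node, ""))
    return factors


def lz78_decode(factors):
    """Rebuild the original string from (prefix_id, char) pairs."""
    phrases = [""]   # phrases[i] is the text of factor i; index 0 is the empty prefix
    for prefix_id, ch in factors:
        phrases.append(phrases[prefix_id] + ch)
    return "".join(phrases[1:])
```

For example, `lz78_factorize("abababa")` splits the string into a | b | ab | aba. Because node ids are assigned in the order factors are created, each trie node id coincides with the index of the factor it represents.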


Related resources

Space Efficient Linear Time Lempel-Ziv Factorization on Constant Size Alphabets

We present a new algorithm for computing the Lempel-Ziv Factorization (LZ77) of a given string of length N in linear time that uses only N log N + O(1) bits of working space, i.e., a single integer array, for constant size integer alphabets. This greatly improves the previous best space requirement for linear time LZ77 factorization (Kärkkäinen et al. CPM 2013), which requires two integer arr...


Suffix Trees for Integer Alphabets Revisited

Farach recently gave a linear-time algorithm for constructing suffix trees for integer alphabets, which solves a major open problem on index data structures. We present a new and somewhat cleaner algorithm for constructing suffix trees for integer alphabets in linear time.


Constacyclic Codes over Group Ring (Zq[v])/G

Recently, codes over some special finite rings, especially chain rings, have been studied. More recently, codes over finite non-chain rings have also been considered. The study of codes over such rings, or rings in general, is motivated by the existence of some special maps called Gray maps whose images give codes over fields. Quantum error-correcting (QEC) codes play a crucial role in protecting quantum ...


On-Line Construction of Position Heaps

We propose a simple linear-time on-line algorithm for constructing a position heap for a string [EMOW11]. Our definition of position heap differs slightly from the one proposed in [EMOW11] in that it considers the suffixes ordered in the descending order of length. Our construction is based on classic suffix pointers and resembles Ukkonen’s algorithm for suffix trees [Ukk95]. Using suffix point...


Approximation of Grammar-Based Compression via Recompression

In this paper we present a simple linear-time algorithm constructing a context-free grammar of size O(g log(N/g)) for the input string, where N is the size of the input string and g the size of the optimal grammar generating this string. The algorithm works for arbitrary size alphabets, but the running time is linear assuming that the alphabet Σ of the input string can be identified with number...



Journal title:
  • Inf. Process. Lett.

Volume 115  Issue 

Pages  -

Publication date 2015